Abstract

We investigate the properties of an autoassociative network of
threshold-linear units whose synaptic connectivity is spatially structured
and asymmetric. Since the methods of equilibrium statistical mechanics cannot
be applied to such a network, due to the lack of a Hamiltonian, we approach
the problem through a signal-to-noise analysis, which we adapt to spatially
organized networks. We analyze the conditions for the appearance of stable,
spatially non-uniform profiles of activity with large overlaps with one of
the stored patterns. We also show, with simulations and analytic results,
that the storage capacity does not decrease much when the connectivity of
the network becomes short range. In addition, the method used here enables
us to calculate exactly the storage capacity of a randomly connected network
with an arbitrary degree of dilution.

1 Introduction 

Considerable theoretical and experimental evidence supports the notion that
cortical networks have been specialized in evolution to serve a memory
function. In particular the hippocampus, sitting at the top of the cortical
hierarchy [T3], is thought to approximate a "pure" associative memory system -
in the formation of e.g. spatial memories in rats or episodic memories in
humans j3U] - in the sense that the activity of individual units is only
meaningful in relation to previous activity, and not in relation to the
physical position of the units in the tissue. At the core of the hippocampus,
information from different sources is associated together within the CA3
network, and the pattern of activity corresponding to a memory item can be
conceived of as an arbitrary, randomly generated compressed representation.
In neocortical areas sitting lower in the hierarchy, instead, memory
operations still reflect the topography informing synaptic connections, with
the result that the activity of a unit relates also to its position in the
tissue. One can of course identify many additional differences between memory
storage in the hippocampus and in the neocortex, e.g. differences in time
scales, but we focus here on two simple types of model that emphasize only
the relationship between memory function and the spatial organization (i.e.,
the geometry) of the connectivity. Both types of model implement an
autoassociative network with recurrent collateral connections, whose efficacy
has been modified during a training phase by associative plasticity (a model
Hebbian "learning rule").

In the first, 'hippocampus' type of model it is assumed that episodic
memories or charts of the local environment have been stored as patterns of
neuronal activity distributed throughout a network. Units in the network are
labeled with an index $i$, $i = 1, \dots, N$, but the connectivity between the
units, or the probability that two units are connected, does not depend on
their indexes. The connectivity can in fact be complete, as in the Hopfield
model [16] and in its graded-response variants [52"], or it can be sparse, but
still independent of the index, as in (2H| or in the highly diluted limit
considered by 52-. This type of model has been thoroughly analyzed in terms of
its storage capacity, yielding a relation between the maximum number of
patterns $p_c$ that can be turned into dynamical attractors, i.e. that can be
associatively retrieved, and the number $C$ of connections per receiving unit.
Typically the relationship includes, as the only other crucial parameter, the
sparseness of firing $a$, and it takes the form

$$ p_c \propto \frac{C}{a \log(1/a)} \tag{1} $$

Note that the storage capacity calculation has been extended to the chart
model introduced by [25], leading to an estimate, parallel to Eq. (1), of the
maximum number of charts that can be stored given their sparsity $a$ and the
number of connections $C$ [H]. Although in a given chart units are arranged
topographically by their spatial selectivity, such an arrangement is
different from chart to chart and unrelated to any absolute "index" -
effectively there is a chart-specific index, randomly reshuffled in each
chart. Correspondingly, there is no absolute geometrical structure to the
connectivity, even though connection weights reflect the storage of multiple
charts.

Typically, the plasticity process, i.e. the modification of connection
weights that leads to the formation of attractors, is not described in detail
by mathematical models of the autoassociative variety, but it is a widely
held hypothesis that in the hippocampal CA3 network attractors are formed by
tuning the synaptic efficacy of its recurrent collaterals with synaptic
plasticity mechanisms akin to LTP and LTD [19]. Very similar mechanisms could
operate in storing memory patterns in the neocortex. Indeed, several reports
on the observation of synaptic plasticity in the isocortex support this idea
^0. Thus memory storage could be mediated by the same processes in the
neocortex as in the hippocampus.

Yet the first type of model reviewed above is inappropriate for analyzing
memory retrieval in the cerebral cortex, because there one has to take
geometry into account. Both the local neocortical patch and CA3, in terms of
the degree of recurrent connectivity, can be thought of as networks of
extensively but sparsely connected pyramidal cells 0, in the sense that each
pyramidal cell receives inputs from thousands of its neighbors, but those
represent only a fraction of the total neighbor population. In CA3, however,
the probability of existence of a synapse between two pyramidal cells does
not change much with their physical distance, whereas in the neocortex it
does depend on their distance. For instance one study JS] shows that in
layers II and III of mouse visual cortex the probability of connection falls
off from 50%-80% for directly adjacent neurons to 0%-15% at a distance of
500 μm. A similar distance dependence and spatially organized pattern of
connectivity could be observed in other isocortical areas; this is what the
type of model mentioned above does not consider, and what makes it
inappropriate for the neocortex.

Investigating a simple associative network model with a geometric structure
informing its connectivity is the purpose of this article. We thus introduce
and analyze an autoassociative network which is composed of threshold-linear
units and includes a geometrical organization of neuronal connectivity, meant
as a simplistic model of the type of organization of connections observed in
the cortex. The units in the model are therefore endowed with an index that
refers to their physical position on an underlying substrate. For simplicity,
we consider periodic boundary conditions in either 1D (a ring) or 2D (a
torus). Connections are taken to be denser between units close on the ring or
the torus than between distant units. Such connectivity structures have been
considered extensively in neural network models of, for example, orientation
selectivity 7 or head direction cells, and have been shown to lead to
localized activity states ('bumps'), corresponding to a specific orientation
or head direction. These models do not include an associative memory
function. In addition to these models, there have been studies on networks
with non-geometric connectivities but spatial correlations in the stored
patterns [20]. Here, we consider both geometry in the connectivity and
associative storage on the connection weights, leading to network states that
can be localized, or correlated to randomly distributed activity patterns
previously stored on the weights, or both.

It is worth noting that it is not straightforward to apply to such networks,
endowed with geometrical connectivity, the methods from statistical physics
which were originally borrowed to solve models like the Hopfield model. These
methods are based on the existence of a Hamiltonian describing the steady,
asynchronous firing states of the system, which leads to a free-energy
function of a limited number of order parameters. One condition for the
existence of a Hamiltonian is that interactions between pairs of units be
symmetric, i.e. that the effect of a pre-synaptic unit on a post-synaptic
unit be exactly reciprocated. This obviously presupposes symmetric
connectivity (and identical weights in the two directions), although it could
also be taken to be a good first approximation to networks with asymmetric
connectivity (see footnote 1). Further, the standard procedure requires that
all variables that bear the index of individual units be averaged out, to
obtain a free-energy that depends solely on non-local, extensive quantities,
which can be assumed in turn to take narrowly constrained values. As we shall
discuss elsewhere, this self-averaging property does not apply to networks
with geometric connectivity. To address these problems we develop an improved
version of the 'self-consistent signal-to-noise analysis' [27].

1 It is worth noting that in a large variety of networks with graded response
units, symmetric connectivity is a necessary but not a sufficient condition
for the existence of a Hamiltonian. It ensures detailed balance and the
existence of a Hamiltonian for models with, for instance, binary neurons or
monotonically increasing analog response functions, but it does not suffice
in a variety of other models [T7|.

The paper is organized as follows. In the second section, we describe a model
of an associative network of threshold-linear model neurons with an arbitrary
geometrical (and sparse) connectivity. We then derive the self-consistent
equations for the order parameters that we define. We refer to these
equations as the mean-field equations. In the third section, we use these
equations to calculate the storage capacity of a network without geometry. We
recover the results previously found by using the replica method for an
'extremely diluted' network, and also calculate the leading deviations from
this limit when the connectivity is less sparse. This exercise yields insight
useful later in analyzing the geometrical model. In the fourth section, we
study a one-dimensional model in which we consider a probability distribution
for the connectivity. We study the behavior of the storage capacity and the
shape of the profiles of activity for such a network, via analytical
arguments and computer simulations. Conclusions are summarized at the end,
while details of the calculation are provided in three appendices.

2 Methods 

2.1 Threshold-linear model 

Consider a network of $N$ units, in which the level of activity of unit $i$ is
represented by a variable $v_i \geq 0$. This variable can be taken to
represent the firing rate of the neuron averaged over a short time window. We
assume that each unit receives $C$ inputs from the other units in the
network. The thermodynamic limit $N \to \infty$ and $C \to \infty$ is
assumed. The specific covariance 'Hebbian' learning rule we consider
prescribes that the synaptic weight between units $i$ and $j$ be given as:

$$ J_{ij} = \frac{c_{ij}}{C a^2} \sum_{\mu=1}^{p} (\eta_i^\mu - a)(\eta_j^\mu - a) \tag{2} $$

where $\eta_i^\mu$ represents the activity of unit $i$ in memory pattern $\mu$
and $c_{ij}$ is a binary variable equal to 1 if there is a connection running
from neuron $j$ to neuron $i$, and 0 otherwise. Each $\eta_i^\mu$ is taken to
be a 'quenched variable', i.e. a given parameter, drawn independently from a
distribution $p(\eta)$, with the constraints $\eta \geq 0$,
$\langle \eta \rangle = \langle \eta^2 \rangle = a$, where
$\langle \cdot \rangle$ stands for the average over the distribution
$p(\eta)$. Here we concentrate on the binary coding scheme
$p(\eta) = a\,\delta(\eta - 1) + (1 - a)\,\delta(\eta)$, but the calculation
can be carried out for any probability distribution. As in one of the first
extensions of the Hopfield model [37], we thus allow for the mean activity $a$
of the patterns to differ from the value $a = 1/2$ of the original model
[32]. We further assume that the input (local field) to unit $i$ takes the
form:



$$ h_i = \sum_{j \neq i} J_{ij} v_j + b\Big( \frac{1}{N} \sum_j v_j \Big), \tag{3} $$

where the first term enables the memories encoded in the weights to determine
the dynamics; the second term is unrelated to the memory patterns, but is
designed to regulate the activity of the network, so that at any moment in
time $x = \frac{1}{N}\sum_i v_i$ and $y = \frac{1}{N}\sum_i v_i^2$ both
approach the prescribed value $a$ (which then parametrizes the sparsity of the
network activity [33]). The activity of each unit is determined by its input
through a threshold-linear function:

$$ v_i = F[h_i] = g\,(h_i - T_{thr})\,\Theta(h_i - T_{thr}) \tag{4} $$

where $T_{thr}$ is a threshold below which the input elicits no output, $g$ is
a gain parameter, and $\Theta(\dots)$ is the Heaviside step function. The
exact details of the updating rule are not specified further here, because
they do not affect the steady states of the dynamics, and we take "fast
noise" levels to be vanishingly small, $T \to 0$. Discussions about the
biological plausibility of this model for networks of pyramidal cells can be
found in |33l H], and will not be repeated here.

In order to analyze this network, we first define a set of order parameters
$\{m_i^\mu\}$, with $\mu = 1, \dots, p$ and $i = 1, \dots, N$, which we call
the local overlaps, as follows:

$$ m_i^\mu = \frac{1}{C} \sum_j c_{ij}\, (\eta_j^\mu/a - 1)\, v_j. \tag{5} $$

This is a natural choice for quantities that measure retrieval while also
taking into account the spatial structure of the network, and hence the
position dependence of the activity.
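
The ingredients introduced so far can be put together in a few lines of
code. What follows is only a minimal sketch in plain Python, under stated
assumptions: the covariance rule is taken in the standard form
J_ij = c_ij/(C a^2) sum_mu (eta_i - a)(eta_j - a), self-connections are
excluded, and the inhibitory term b(x) = -k (x - a) with a hypothetical
strength k is an illustrative choice, since the text does not specify the
form of b. All function names are hypothetical.

```python
import random


def threshold_linear(h, gain, threshold):
    """Eq. (4): F[h] = g (h - T_thr) for h > T_thr, and 0 otherwise."""
    return gain * (h - threshold) if h > threshold else 0.0


def make_patterns(n_units, n_patterns, a, rng):
    """Binary coding scheme: eta = 1 with probability a, eta = 0 otherwise."""
    return [[1.0 if rng.random() < a else 0.0 for _ in range(n_units)]
            for _ in range(n_patterns)]


def random_connectivity(n_units, c_per_unit, rng):
    """c_ij = 1 with probability C/N (no geometry); no self-connections."""
    prob = c_per_unit / n_units
    return [[1 if (i != j and rng.random() < prob) else 0
             for j in range(n_units)] for i in range(n_units)]


def covariance_weights(patterns, conn, a, c_per_unit):
    """ASSUMED form: J_ij = c_ij / (C a^2) * sum_mu (eta_i - a)(eta_j - a)."""
    n = len(conn)
    return [[conn[i][j] * sum((p[i] - a) * (p[j] - a) for p in patterns)
             / (c_per_unit * a * a) for j in range(n)] for i in range(n)]


def local_overlap(conn, pattern, v, a, c_per_unit, i):
    """Local overlap of unit i: (1/C) sum_j c_ij (eta_j / a - 1) v_j."""
    return sum(conn[i][j] * (pattern[j] / a - 1.0) * v[j]
               for j in range(len(v))) / c_per_unit


def update(weights, v, gain, threshold, inhibition, a):
    """One synchronous step; b(x) = -inhibition * (x - a) is HYPOTHETICAL."""
    n = len(v)
    x = sum(v) / n
    b = -inhibition * (x - a)
    return [threshold_linear(sum(weights[i][j] * v[j] for j in range(n)) + b,
                             gain, threshold) for i in range(n)]
```

With no self-connections, the memory part of the field computed from the
weights coincides with the sum over patterns of (eta_i/a - 1) times the
local overlaps, which is the identity the analysis below relies on.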

If we rewrite the local field $h_i$ defined above in terms of these order
parameters we have:

$$ h_i = \sum_\mu (\eta_i^\mu/a - 1)\, m_i^\mu - c_{ii}\,\alpha\,(1/a - 1)\, v_i + b(x) \tag{6} $$

in which $\alpha = p/C$ is the storage load, and we have carried out the
average $\sum_\mu (\eta_i^\mu/a - 1)^2 \simeq p\,(1/a - 1)$. We will use this
expression for the local field in the next section.
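
The step from the weights to this form of the local field can be spelled
out. The sketch below assumes the covariance rule in the standard form
$J_{ij} = \frac{c_{ij}}{C a^2}\sum_\mu (\eta_i^\mu - a)(\eta_j^\mu - a)$,
local overlaps $m_i^\mu = \frac{1}{C}\sum_j c_{ij}(\eta_j^\mu/a - 1)v_j$ that
include the $j = i$ term, and a field that excludes self-coupling; these are
readings of the definitions above, not additional results.

```latex
\sum_{j \neq i} J_{ij} v_j
  = \sum_\mu (\eta_i^\mu/a - 1)\,\frac{1}{C}\sum_{j \neq i} c_{ij}\,(\eta_j^\mu/a - 1)\,v_j
  = \sum_\mu (\eta_i^\mu/a - 1)\,m_i^\mu
    - \frac{c_{ii}}{C}\,v_i \sum_\mu (\eta_i^\mu/a - 1)^2 ,
\\[4pt]
\big\langle (\eta/a - 1)^2 \big\rangle
  = \frac{\langle \eta^2 \rangle}{a^2} - \frac{2\langle \eta \rangle}{a} + 1
  = \frac{1}{a} - 1
\;\Longrightarrow\;
\frac{c_{ii}}{C}\,v_i\,p\,(1/a - 1) = c_{ii}\,\alpha\,(1/a - 1)\,v_i .
```

The subtracted term only removes the self-interaction that the local overlap
would otherwise contain when $c_{ii} = 1$.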



2.2 Retrieval states and the mean-field equations 

A pattern $\mu$ is said to be retrieved if $\sum_i m_i^\mu = O(N)$. Without
loss of generality, we suppose that the first pattern is the retrieved one,
and therefore $m_i^\mu \ll m_i^1$ for $\mu \neq 1$ and any $i$. When one
pattern is retrieved, the local field to each unit can be decomposed into two
terms. One is the signal, which is in the direction of keeping the network in
a state with large overlap with the retrieved pattern. The second term, which
can be called noise, contributes random interference. In Eq. (6) the signal
is nothing but the $\mu = 1$ term in the sum on the r.h.s., whereas the noise
is the rest. The idea is to calculate these terms as a function of the local
overlaps with the retrieved pattern. In other words we wish to express the
r.h.s. of Eq. (6) solely as a function of $m_i \equiv m_i^1$ and $\eta_i^1$.
If we do so, we can then express the activity of each unit as a function of
$m_i$, and by inserting it in the definition of local overlaps, we will be
able to find a self-consistent equation for the local overlap with the first
pattern.

To proceed further, we define two more local order parameters $\rho_i$ and
$\gamma_i$ through the equation:

$$ \sum_{\mu \neq 1} (\eta_i^\mu/a - 1)\, m_i^\mu = \rho_i z + \gamma_i v_i, \tag{7} $$

where we take $z$ to have quenched-averaged standard deviation unity. We then
single out a generic pattern $\mu$ from the sum over non-retrieved patterns,
writing:

$$ \sum_{\nu \neq 1, \mu} (\eta_i^\nu/a - 1)\, m_i^\nu = \rho_i^\mu z + \gamma_i^\mu v_i, \tag{8} $$

and noting that, to leading order in $1/p$, $\rho_i^\mu \simeq \rho_i$ and
$\gamma_i^\mu \simeq \gamma_i$ are expected to be independent of $\mu$. With
this, we can write the activity of the network as:

$$ v_i = F\big[ (\eta_i^1/a - 1)\, m_i^1 + (\eta_i^\mu/a - 1)\, m_i^\mu + \rho_i^\mu z + \gamma_i^\mu v_i - c_{ii}\,\alpha\,(1/a - 1)\, v_i + b(x) - T_{thr} \big] \tag{9} $$

from which $v_i$ can be found self-consistently, as in [H|:

$$ v_i \simeq G\big[ (\eta_i^1/a - 1)\, m_i^1 + (\eta_i^\mu/a - 1)\, m_i^\mu + \rho_i^\mu z + b(x) - T_{thr} \big] \tag{10} $$

Assuming that $\Gamma_i \equiv \gamma_i - c_{ii}\,\alpha\,(1/a - 1) < 1/g$
(see footnote 2), the function $G[x]$ takes the following form for a
threshold-linear unit:

$$ G[x] = \frac{g}{1 - g\Gamma_i}\, x\, \Theta(x). \tag{11} $$

In the case of a non-geometric network, as discussed by Shiino and Yamana
[25], this factor $\Gamma$ equals minus the Onsager reaction term, when one
treats the network in the TAP equation framework.
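
The form of $G[x]$ can be checked in one line. The following is a sketch,
assuming that the self-coupling enters the argument of the threshold-linear
function as $\Gamma_i v_i$, and writing $x$ for the remaining part of the
input, which does not depend on $v_i$ (threshold included).

```latex
v_i = g\,(x + \Gamma_i v_i)\,\Theta(x + \Gamma_i v_i)
\;\Longrightarrow\;
v_i\,(1 - g\Gamma_i) = g\,x \quad \text{on the active branch } v_i > 0,
\qquad\text{so}\qquad
v_i = \frac{g}{1 - g\Gamma_i}\,x\,\Theta(x).
```

On the active branch $x + \Gamma_i v_i = x/(1 - g\Gamma_i)$, so for
$g\Gamma_i < 1$ the unit is active exactly when $x > 0$, which is why the step
function can be written in terms of $x$ alone.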

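Now we expand the r.h.s. of the above equation for $v_i$ up to the linear term
in $m_i^\mu$ and insert the result in Eq. (5) to get:

$$ m_i^\mu = L_i^\mu + \sum_j K_{ij}^\mu\, m_j^\mu, \tag{12} $$

where:

$$ L_i^\mu = \frac{1}{C} \sum_j c_{ij}\, (\eta_j^\mu/a - 1)\, G\big[ (\eta_j^1/a - 1)\, m_j^1 + \rho_j^\mu z + b(x) - T_{thr} \big] $$

$$ K_{ij}^\mu = \frac{1}{C}\, c_{ij}\, (\eta_j^\mu/a - 1)^2\, G'\big[ (\eta_j^1/a - 1)\, m_j^1 + \rho_j^\mu z + b(x) - T_{thr} \big]. $$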



2 We shall see later that this assumption is valid, at least when one deals with diluted 
networks or very low storage loads. 
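Solving the above equation for $m_i^\mu$, $\mu \neq 1$, and using it in the
expression defining $m_i \equiv m_i^1$, we get the following self-consistent
(mean-field) equations (see Appendix I):

$$ \Gamma_i = \alpha T_0\, \psi_{ii} $$

$$ \rho_i^2 = \frac{\alpha T_0^2 g^2}{C} \sum_j \big( c_{ij} + 2 c_{ij}\psi_{ij} + \psi_{ij}^2 \big) \Big\langle \int^{+}\! Dz\, \big( (\eta/a - 1)\, m_j + b(x) - T_{thr} - \rho_j z \big)^2 \Big\rangle_\eta (1 - g\Gamma_j)^{-2} \tag{13} $$

$$ x = \frac{g}{N} \sum_j \Big\langle \int^{+}\! Dz\, \big( (\eta/a - 1)\, m_j + b(x) - T_{thr} - \rho_j z \big) \Big\rangle_\eta (1 - g\Gamma_j)^{-1} $$

$$ m_i = \frac{g}{C} \sum_j c_{ij} \Big\langle (\eta/a - 1) \int^{+}\! Dz\, \big( (\eta/a - 1)\, m_j + b(x) - T_{thr} - \rho_j z \big) \Big\rangle_\eta (1 - g\Gamma_j)^{-1} $$

where $Dz = \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz$ and the superscript + indicates
that the integration has to be carried out in the range where
$(\eta/a - 1)\, m_i + b(x) - T_{thr} > \rho_i z$. The new order parameters
$\psi_{ij}$ have been defined in the derivation of these equations in
Appendix I.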



3 Diluted network without structure 

To proceed further let us first consider the case in which there is no
geometry and the $c_{ij}$'s are randomly generated with probability
$\Pr\{c_{ij} = 1\} = C/N$. In this case, in the definition of $\psi$ (see
Appendix I), for the first sum on the r.h.s. we have:

$$ \sum_l K_{il}\, c_{lj} = \frac{1}{C} \sum_l c_{il}\, c_{lj}\, (\eta_l^\mu/a - 1)^2\, G'\big[ (\eta_l^1/a - 1)\, m_l^1 + \rho_l^\mu z + b(x) - T_{thr} \big]. \tag{14} $$

If we replace the sum with an average over the distribution of $\{c_{ij}\}$
and $\{\eta_i\}$, and neglect in this averaging any correlation between the
position of the unit and its activity (this assumption will have to be
reviewed, of course, in the geometric case), we obtain

$$ \sum_l K_{il}\, c_{lj} = \frac{C}{N} \big\langle (\eta/a - 1)^2 \big\rangle_\eta \big\langle G'[j] \big\rangle_{\eta,z} = \frac{C T_0}{N} \big\langle G'[j] \big\rangle_{\eta,z} \tag{15} $$

and, in fact, one notes that it can be written, to any order in $n$,

$$ \sum_l [K^n]_{il}\, c_{lj} = \frac{C}{N} \big( T_0 \langle G'[j] \rangle_{\eta,z} \big)^n = \frac{C}{N}\, \Omega^n \tag{16} $$

where we have defined the quantity:

$$ \Omega = T_0 \big\langle G'[j] \big\rangle_{\eta,z} = \frac{g T_0}{1 - g\Gamma} \Big\langle \int^{+}\! Dz \Big\rangle_\eta \tag{17} $$

that we assume to self-average, i.e. not to depend on the index $j$.
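Using the above expression yields the steady state equations:

$$ \psi = \frac{C}{N} (\Omega + \Omega^2 + \Omega^3 + \dots) $$

$$ \Gamma = \alpha T_0\, \psi $$

$$ \rho^2 = \alpha \Big( \frac{g T_0}{1 - g\Gamma} \Big)^2 \Big( 1 + 2\psi + \frac{N}{C}\psi^2 \Big) \Big\langle \int^{+}\! Dz\, \big( (\eta/a - 1)\, m + b(x) - T_{thr} - \rho z \big)^2 \Big\rangle_\eta \tag{18} $$

$$ m = \frac{g}{1 - g\Gamma} \Big\langle (\eta/a - 1) \int^{+}\! Dz\, \big( (\eta/a - 1)\, m + b(x) - T_{thr} - \rho z \big) \Big\rangle_\eta $$

$$ x = \frac{g}{1 - g\Gamma} \Big\langle \int^{+}\! Dz\, \big( (\eta/a - 1)\, m + b(x) - T_{thr} - \rho z \big) \Big\rangle_\eta . $$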



The fact that $\psi$ vanishes in the limit $C/N \to 0$ can be understood
intuitively. The order parameter $\psi$ is nothing but the contribution of the
activity reverberating in the loops of the network. When one considers a
highly diluted network, the number of such loops becomes negligible, and they
do not contribute to network dynamics. Thus $\psi$ and $\Gamma$ vanish in this
limit. This also makes the above inequality $\Gamma_i < 1/g$ a valid
assumption. $\Gamma$ essentially measures the effect of the activity of each
unit on itself, after it has reverberated through the network, and this
effect becomes negligible when one deals with an extremely diluted network.
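
The way the loop contribution disappears with dilution can be made concrete
numerically. The sketch below assumes the series
psi = (C/N)(Omega + Omega^2 + ...) and the relation Gamma = alpha T_0 psi as
read from the steady state equations; Omega, alpha and T_0 are simply passed
in as numbers, and the function names are hypothetical.

```python
def psi_series(omega, dilution, n_terms=200):
    """Partial sum of psi = (C/N) (Omega + Omega^2 + ...), dilution = C/N."""
    return dilution * sum(omega ** n for n in range(1, n_terms + 1))


def psi_closed(omega, dilution):
    """Closed form of the geometric series, valid for |Omega| < 1."""
    return dilution * omega / (1.0 - omega)


def gamma_from_psi(alpha, t0, psi):
    """Gamma = alpha * T_0 * psi, with T_0 supplied by the caller."""
    return alpha * t0 * psi
```

For a fixed Omega < 1 both psi and Gamma are proportional to C/N, so they
vanish linearly as the network is diluted.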

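We can then define the new variables $v = m/\rho$ and
$w = [b(x) - m - T_{thr}]/\rho$ and the following integrals, which are
functions of $v$ and $w$, as in [82]:

$$ A_2 = \frac{1}{v T_0} \Big\langle (\eta/a - 1) \int^{+}\! Dz\, \Big( w + \frac{v\eta}{a} - z \Big) \Big\rangle_\eta $$

$$ A_1 = A_2 - \Big\langle \int^{+}\! Dz \Big\rangle_\eta \tag{19} $$

$$ A_3 = \Big\langle \int^{+}\! Dz\, \Big( w + \frac{v\eta}{a} - z \Big)^2 \Big\rangle_\eta . $$

By using this notation, as shown in Appendix II, one finds that:

$$ \Omega = 1 - (A_1/A_2) \tag{20} $$

and the remaining steady state equations can be reduced to:

$$ E_1(w, v) \equiv A_2^2 - \Big( 1 + \frac{C}{N} \frac{(2 - \Omega)\,\Omega}{(1 - \Omega)^2} \Big) \alpha A_3 = 0 \tag{21} $$

$$ E_2(w, v) \equiv \Big( \frac{1}{g T_0} - \alpha \frac{C}{N} \frac{\Omega}{1 - \Omega} \Big) - A_2 = 0 \tag{22} $$

which extend and interpolate earlier results to finite values of $C/N$.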

The first equation above appears as a closed curve in the $(w, v)$ plane,
which shrinks in size when one increases $\alpha$ and then disappears; whereas
the second equation is an almost straight curve, which for a certain range of
$g$ intersects twice with the closed curve above. Since for a given value of
$\alpha$ such that the first equation is satisfied there always exists a value
of $g$ that satisfies the second equation, the storage capacity is the value
of $\alpha$ for which the closed curve shrinks to a point. We treat $g$ as a
free parameter, because it can be easily changed in a network by mechanisms
like multiplicative inhibition, if required in order to approach the optimal
storage load.
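
The shrinking-curve criterion can be illustrated numerically in the extremely
diluted limit, where the first equation reduces to A_2^2 - alpha A_3 = 0, so
that the curve shrinks to a point at the maximum of A_2^2 / A_3 over (w, v).
The sketch below is only an illustration under explicit assumptions: binary
patterns, T_0 taken to be <(eta/a - 1)^2> = (1 - a)/a (not stated in this
section), Gaussian integrals written in closed form, and a crude grid search
over a hypothetical window of (w, v). Function names are hypothetical.

```python
import math


def Phi(u):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def phi(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def n1(u):
    """Integral of Dz (u - z) over the range z < u."""
    return u * Phi(u) + phi(u)


def n2(u):
    """Integral of Dz (u - z)^2 over the range z < u."""
    return (1.0 + u * u) * Phi(u) + u * phi(u)


def integrals(w, v, a):
    """A1, A2, A3 for binary patterns, ASSUMING T_0 = (1 - a) / a."""
    t0 = (1.0 - a) / a
    u1, u0 = w + v / a, w  # integrand arguments for eta = 1 and eta = 0
    a2 = (1.0 - a) * (n1(u1) - n1(u0)) / (v * t0)
    a1 = a2 - (a * Phi(u1) + (1.0 - a) * Phi(u0))
    a3 = a * n2(u1) + (1.0 - a) * n2(u0)
    return a1, a2, a3


def capacity_extreme_dilution(a, n_grid=120):
    """Grid estimate of max A2^2 / A3 over (w, v); returns (alpha, w, v)."""
    best = (0.0, None, None)
    for i in range(n_grid + 1):
        w = -4.0 + 6.0 * i / n_grid          # hypothetical window for w
        for k in range(1, n_grid + 1):
            v = a * 8.0 * k / n_grid         # hypothetical window for v
            _, a2, a3 = integrals(w, v, a)
            alpha = a2 * a2 / a3
            if alpha > best[0]:
                best = (alpha, w, v)
    return best
```

A finer grid or a local optimizer would sharpen the estimate; for finite C/N
the factor multiplying alpha A_3 depends on Omega = 1 - A_1/A_2 and has to be
included in the maximization.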
Figure 1: Storage capacity vs. $a$ for $C/N = 0$ (full curve), $C/N = 0.05$
(dashed line) and $C/N = 1$ (dotted line).

In the limit of extreme dilution, i.e. $C/N \to 0$, $\Omega$ does not
contribute to the equation for the storage capacity. The result of
calculating the storage capacity as a function of the sparseness of the
coding is shown in Fig. 1 (the full curve). For other values of $C/N$ the
contribution from $\Omega$ should be taken into account, which for small
$C/N$ results in deviations from the storage capacity of a highly diluted
network. An example is illustrated in Fig. 1 for $C/N = 0.05$. It is clear
that, at least for small $a$, a network with 5% connectivity can be considered
as highly diluted, in the sense that for sparse patterns of activity, the
effect of loops - which produces the difference between $A_2$ and $A_1$ -
becomes unimportant.
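
The role of the loops can be made explicit at the opposite extreme,
$C/N = 1$. The short check below assumes Eqs. (20) and (21) in the form
$\Omega = 1 - A_1/A_2$ and
$E_1 = A_2^2 - [1 + (C/N)(2-\Omega)\Omega/(1-\Omega)^2]\,\alpha A_3$; it is an
illustration, not a step taken from the appendices.

```latex
1 + \frac{(2-\Omega)\,\Omega}{(1-\Omega)^2}
  = \frac{(1-\Omega)^2 + 2\Omega - \Omega^2}{(1-\Omega)^2}
  = \frac{1}{(1-\Omega)^2}
  = \frac{A_2^2}{A_1^2},
\qquad\text{hence}\qquad
E_1\big|_{C/N=1} = A_2^2 - \frac{A_2^2}{A_1^2}\,\alpha A_3 = 0
\;\Longleftrightarrow\;
A_1^2 - \alpha A_3 = 0 .
```

At full connectivity $A_1$ thus takes the place that $A_2$ has at extreme
dilution, where the condition reads $A_2^2 - \alpha A_3 = 0$.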

4 The network with geometrical connectivity 

In this section we study the fixed point equations of the network when the
probability of the existence of a connection between two units depends on
their distance, as opposed to the previous case. An interesting example, in
one dimension, is a network with a Gaussian connectivity probability
distribution:

$$ \Pr\{c_{ij} = 1\} = \frac{C}{\sqrt{2\pi\sigma^2}}\, e^{-(i-j)^2/2\sigma^2} + \text{Baseline}. \tag{23} $$

The baseline is a correction that has to be considered for
$\sigma \propto N$, to ensure that when the ring cannot be taken to be
infinite, we still have $\sum_j c_{ij} = C$.
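
A possible way to generate such a connectivity on a ring is sketched below.
It is an illustration under assumptions that the text does not fix: distances
are measured along the shorter arc of the ring, self-connections are
excluded, and the baseline is a uniform constant chosen so that the expected
number of inputs per unit is exactly C. Function names are hypothetical.

```python
import math
import random


def ring_distance(i, j, n_units):
    """Shortest distance between sites i and j on a ring of n_units sites."""
    d = abs(i - j) % n_units
    return min(d, n_units - d)


def connection_probabilities(i, n_units, c_per_unit, sigma):
    """Pr{c_ij = 1} for every j: Gaussian in ring distance plus a baseline.

    The uniform baseline makes the probabilities sum to C over j.
    Excluding the self-connection (j = i) is an assumption of this sketch.
    """
    norm = c_per_unit / math.sqrt(2.0 * math.pi * sigma ** 2)
    gauss = [0.0 if j == i else
             norm * math.exp(-ring_distance(i, j, n_units) ** 2
                             / (2.0 * sigma ** 2))
             for j in range(n_units)]
    baseline = (c_per_unit - sum(gauss)) / (n_units - 1)
    return [0.0 if j == i else g + baseline for j, g in enumerate(gauss)]


def sample_inputs(i, n_units, c_per_unit, sigma, rng):
    """Draw the binary row of inputs to unit i from the probabilities above."""
    probs = connection_probabilities(i, n_units, c_per_unit, sigma)
    return [1 if rng.random() < p else 0 for p in probs]
```

For very small sigma the peak value C / sqrt(2 pi sigma^2) can exceed 1, in
which case the probabilities would have to be clipped; the sketch does not
handle that regime.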

In this geometrical case there can be spatially non-uniform solutions to the
steady state equations. We then have to analyze two related issues, both
emerging with decreasing $\sigma$, as the network approaches a more local
connectivity: the appearance of non-uniform solutions to the equations, and
whether the storage capacity of the network decreases.

4.1 Appearance of spatially non-uniform activity 

When $\sigma$ is large one may expect the solutions of the equations that we
have previously found to be again independent of space. It can be seen from
the simulations that this is actually the case. Indeed when one measures the
local overlap with different patterns in a steady state, one can observe that
for values of $\sigma$ larger than a critical value $\sigma_c$ the solutions
do not show spatial dependence; but, as soon as $\sigma$ becomes smaller than
$\sigma_c$, the local overlap begins to display some spatial dependence,
which increases by further decreasing $\sigma$. This is illustrated in
Fig. 2. In this subsection we aim to study this phenomenon analytically.

The simulations indicate that close to the transition the spatial dependence
of the overlap takes the form of a cosine, whereas for lower $\sigma$ values
it approaches a gaussian. One can easily check analytically, however, that
considering a gaussian ansatz for $m_i$, and another gaussian for, say,
$\rho_i$, does not solve the mean-field equations, Eqs. (13), which do not in
fact appear to admit any simple curve as a solution. This led us to develop
approximate treatments that circumvent the lack of a closed-form
spatially-dependent solution.

Figure 2: The dependence of the local overlap on $\sigma$. Results are from
simulating a network with $N = 6400$, $C = 320$, $p = 32$, $g = 0.7$,
$a = 0.2$ and $\sigma = 1900$ (a), $\sigma = 1700$ (b), $\sigma = 1500$ (c),
$\sigma = 1000$ (d), $\sigma = 700$ (e) and $\sigma = 500$ (f). In each panel,
the upper fluctuating curve is the local overlap with the retrieved pattern
and the lower one the local overlap with one of the non-retrieved patterns.
The black line inside the local overlap with the retrieved pattern is a
smoothed version of the local overlap, calculated by averaging over 100
nearby units for each point. A smooth change of the local overlap from a
uniform shape to the first Fourier mode is clear.

From what we see in the simulations we assume that the transition to the
spatially non-uniform solution is smooth (second order). We further assume
that $C/\sigma_c$ is small so that to a first approximation, in order to
determine the critical point, we can neglect the effect of loops, i.e. of
$\psi$ and $\Gamma$. This assumption may well be inappropriate (for small
$g$, for instance, as we shall see later) but we hypothesize that its effect
will not distort a qualitative picture of the phenomenon too much. This can
be verified by simulations. Using this ansatz, we now write the solutions of
the fixed point equations as follows:

$$ m_i = m^0 + \delta m_i, \qquad |\delta m_i| \ll m^0 $$

$$ \rho_i = \rho^0 + \delta\rho_i, \qquad |\delta\rho_i| \ll \rho^0 $$

$$ T = T^0 + \delta T, \qquad |\delta T| \ll T^0 $$

where $m^0$ and $\rho^0$ are the uniform parts of the solutions, which we take
to be the solutions of the fixed point equations for $\sigma = \infty$; and
$\delta m_i$ and $\delta\rho_i$ are the small deviations from uniformity. In
the same way $T^0$ is the value of the threshold which sets the mean activity
$x = a$ for the uniform solution and $T^0 + \delta T$ is the (uniform)
threshold necessary to keep $x = a$ in the presence of the non-uniform terms
$\delta m_i$ and $\delta\rho_i$. It is worth noting that a more accurate
approach would be to use the values of $m^0$ and $\rho^0$ calculated for
$\sigma$ just above the transition value $\sigma_c$. These values would be
different from those at $\sigma = \infty$, as the effects of loops may become
important close to the transition to non-uniform solutions; but as stated
before we provisionally neglect this inaccuracy. Using these assumptions and
expanding the mean-field equation around the uniform solutions one obtains
equations for the fluctuations. The continuum limit can be approached by
averaging both sides of these equations over a length scale $\Lambda$ which
is large enough to effectively sample the distributions of both
$\{\eta_j\}$ and $\{c_{ij}\}$. In this limit the equations are of the
following form:

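$$ \delta m(r) = \int dr' \Big\{ \Big( \frac{a_{11}}{C}\, c(r, r') - \frac{b_{11}}{N} \Big) \delta m(r') + \Big( \frac{a_{12}}{C}\, c(r, r') - \frac{b_{12}}{N} \Big) \delta\rho(r') \Big\} \tag{24} $$

$$ \delta\rho(r) = \int dr' \Big\{ \Big( \frac{a_{21}}{C}\, c(r, r') - \frac{b_{21}}{N} \Big) \delta m(r') + \Big( \frac{a_{22}}{C}\, c(r, r') - \frac{b_{22}}{N} \Big) \delta\rho(r') \Big\} \tag{25} $$

where the coefficients are defined in Appendix III.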

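Next we take the Fourier transform of the two sides of the above equations,
to get:

$$ \begin{pmatrix} 1 - a_{11} e^{-\sigma^2 k^2/2} + b_{11}\delta_{k,0} & -a_{12} e^{-\sigma^2 k^2/2} + b_{12}\delta_{k,0} \\ -a_{21} e^{-\sigma^2 k^2/2} + b_{21}\delta_{k,0} & 1 - a_{22} e^{-\sigma^2 k^2/2} + b_{22}\delta_{k,0} \end{pmatrix} \begin{pmatrix} \delta m(k) \\ \delta\rho(k) \end{pmatrix} = 0. $$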

The above equations for $\delta m$ and $\delta\rho$ have a non-trivial
solution if and only if the determinant of the matrix of coefficients becomes
zero. For $k = 0$ the matrix (which includes the $b$ terms) is the same as
that which determines the stability of the uniform solution in the network
without geometry. On the other hand, when $k \neq 0$ the matrix does not
include the $b$ terms, and the vanishing of its determinant yields the
critical value of $\sigma$, as it signals the instability of the uniform
solution towards the appearance of a non-uniform Fourier mode. For large
$\sigma$, the determinant of the coefficient matrix approaches 1, and it
decreases with decreasing $\sigma$. For those values of the other parameters
(besides $\sigma$) for which the determinant of the matrix is negative at
$\sigma = 0$, there would be a critical value for $\sigma$ at which the
determinant becomes zero, and therefore a transition occurs. It is clear that
the first Fourier mode, i.e. $k = 2\pi/N$, is the one that appears first. This
is what one actually observes in simulations.
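
The argument can be illustrated with a toy calculation. The sketch below
assumes that, for k different from zero, the coefficient matrix has the form
1 - A exp(-sigma^2 k^2 / 2) with a 2x2 matrix A, which is consistent with the
statements that the determinant tends to 1 at large sigma and contains no b
terms. The numerical entries of A used here are HYPOTHETICAL placeholders,
not the Appendix III coefficients, and the function names are hypothetical.

```python
import math


def gaussian_mode_factor(sigma, mode, n_units):
    """Fourier factor exp(-sigma^2 k^2 / 2) with k = 2 pi mode / N."""
    k = 2.0 * math.pi * mode / n_units
    return math.exp(-0.5 * (sigma * k) ** 2)


def determinant(sigma, mode, n_units, coeff):
    """det(1 - A f) for a 2x2 matrix A and f the Gaussian mode factor."""
    f = gaussian_mode_factor(sigma, mode, n_units)
    (a11, a12), (a21, a22) = coeff
    return (1.0 - a11 * f) * (1.0 - a22 * f) - a12 * a21 * f * f


def critical_sigma(mode, n_units, coeff, sigma_max=None, tol=1e-10):
    """Bisection for the sigma at which the determinant crosses zero."""
    lo = 0.0
    hi = sigma_max if sigma_max is not None else 10.0 * n_units
    if determinant(lo, mode, n_units, coeff) >= 0.0:
        return None  # determinant not negative at sigma = 0: no transition
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if determinant(mid, mode, n_units, coeff) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# HYPOTHETICAL coefficients, chosen only so that det < 0 at sigma = 0.
EXAMPLE_COEFF = ((1.5, 0.2), (0.1, 0.3))
```

Because the mode factor depends on sigma only through the product sigma k,
the critical sigma of mode n is 1/n times that of the first mode, so the
first mode is the one that becomes unstable first as sigma is lowered.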




Figure 3: The determinant of the coefficient matrix versus $\sigma/N$ for
$g = 0.7$ (full curve), $g = 0.6$ (dashed curve) and $g = 0.5$ (dotted curve).
The other parameters are $a = 0.2$ and $p/C = 0.1$.
In Fig. 3 (full line) we have plotted the determinant of the coefficient
matrix as a function of $\sigma/N$ for the first Fourier mode. This is done
for three values of $g$: 0.7, 0.6 and 0.5. We deduce from this graph that the
value of $\sigma_c$ increases with increasing $g$. Note however that the
first Fourier mode, a cosine, is not strictly speaking a solution of
Eqs. (13) for any $\sigma$ value below $\sigma_c$. The discrepancy becomes
more evident, see Fig. 2, as $\sigma$ decreases, and the solution eventually
becomes localized, i.e. it takes non-zero values for the order parameters
only on a limited fraction of the ring. In that regime solutions look like
gaussian curves, but they are not exact gaussians.

It is important to realize that the $b$ terms in the above equations come from
the effective threshold which results from our uniform inhibitory term, and
fixes the mean activity of the network at $x = a$. This is important because
if there were no $b$ terms, the condition for the instability of the uniform
solution for the non-geometric network and the condition for the appearance
of the non-uniform solutions at finite $\sigma$ would have been the same.
Therefore without an activity-dependent threshold, stable retrieval in the
non-geometric network would have implied no spatially non-uniform solution in
its geometric counterpart.

4.2 Storage capacity 

The storage capacity of the geometric network differs from that of the
non-geometric one for two main reasons. The first one is that by changing the
geometry of the connectivity one changes the distribution of the connectivity
loops that contribute to noise reverberation in the network. Effectively,
lowering $\sigma$ increases the clustering of the nodes in the network [38]
and the number of its loops, leading to a decrease in storage capacity for
the same reason that the capacity decreases (if expressed as
$\alpha = p/C$) when a diluted non-geometric network reaches a denser
connectivity.

The second one is the non-uniformity of the spatial distribution of the
signal and noise to the units, i.e. the spatial dependence of $m(r)$ and
$\rho(r)$. Effectively, the connections originating from the less active
units at the flanks of the spatial profile are used less, or even unused if
the solution is localized and those units remain inactive, and the network
becomes roughly equivalent to one with a lower $C$ value.

These two effects are correlated with each other, but one can get an estimate 
of how they affect the storage capacity by first considering them separately. 
In other words, one can consider a network with structured connectivity and 
calculate its storage capacity if the profile of activity has no spatial dependence. 
Although we know from the previous sections that this kind of uniform solution 
would not be the stable state of the network, we can calculate this way the 
effect of the change in the distribution of the loops on the storage capacity. 
On the other hand, one can consider a network without geometric connectivity, 
but with a spatially non-uniform activity, although again this would not be the 
stable solution. We will follow this approach in the coming subsections. 
4.2.1 The effect of the loops



We start by considering a uniform solution to the mean-field equations,
Eqs. (13), while considering a gaussian connectivity. The uniform solution is
of course neither stable nor relevant when $\sigma < \sigma_c$: as discussed
previously (see Fig. 2 and section 4.1) the actual form of the activity is
quite close to a cosine immediately below $\sigma_c$, and then slowly
transforms to a localized quasi-gaussian activity (Fig. 2f) as $\sigma$
decreases. Although we know that the uniform solution is not stable below
$\sigma_c$, we can still calculate a reference storage capacity by inserting
it into the mean-field equations. Then we can compare the results with the
simulations and assess their correspondence. In the next section we shall see
that this procedure, even though very crude, gives us an estimate of the true
storage capacity which is comparable with the simulations.

By considering the uniform solution to the equations, we have: 

C °° (r-£) 1 

V27TCT 2 ^ Vn + 1 



and hence: 



/ dr"{2c(r - r>(r - r") + ^(r r") 2 } = -=£= £ -^= fi " (27) 

Using the above, we can write the equation for the storage capacity as: 

A 2 - f 1 + - 7 £= Y -%±L=(l n ) = (28) 

Using this equation and evaluating the series numerically, we have calculated 
the storage capacity for various values of σ in a network with C = 320 and 
a = 0.2, as shown by the dashed curve in Fig. 6. This graph indicates that the 
storage capacity decreases with σ, but not by much. The assumption of 
considering the uniform solution implies that such a decrease is due solely to 
the increased relevance, as σ decreases, of closed loops, and therefore of 
increased noise reverberation. Although this analysis does not take into account 
additional effects due to the emergence of non-uniform solutions, we shall see 
in the next section that the capacity decrease in the graph is quite comparable 
with the results of the simulations, also for values of σ which lead to 
non-uniform steady states. 



4.2.2 The effect of the non-uniform solution 

We can get an idea of how the form of the solution affects the storage capacity 
simply by considering an ansatz on the form of ρ(r) and m(r) which depends on a 
finite set of parameters {A_1, A_2, ..., A_k}. We also assume, to start with, 
that the effects of loops are negligible, so that we can set ψ(r, r') = 0 and 
Γ(r) = 0. Loops are of course not negligible, e.g. when σ is small, but we now 
want to isolate the effect of the non-uniformity from that of loops, which has 
been estimated in the previous part using the uniform solution. In other words, 
by this procedure we actually consider a network without loops and calculate 
the storage capacity corresponding to a parametrized non-uniform activity 
profile, although we know it is unstable in a structure-less network. 

With this ansatz, and integrating over r on both the left- and right-hand sides 
of the equations describing ρ(r) and m(r), i.e. Eqs. 13, we can write after some 
manipulation: 

fdrl 2 (r) \ 2 fdrUr) , , 

a v j r m , = ( 29 ) 



Tq J drm(r) J J dr[p(r)] 2 

J drm(r) 

x = WW) 



drh(r) (30) 



h{r) = ((^-1) [ + Dz((^- l)m(r)+b(x)-T thr - p(r)z)) 



1-M-) = . / Dz((^-l)m(r)+b(x)-T thr - p{r)zf) 
h(r) = (I Dz{{ r ^--l)m{r)+b{x)-T thr -p{r)z)) 



In practice, for any given form of the functions m(r) and ρ(r), one can solve 
Eq. 30 to get b(x) - T_thr by setting its r.h.s. to x = a. Then one can use this 
to evaluate the integrals appearing in Eq. 29 and find the highest value of α 
for which a solution of this equation exists in the {A_1, A_2, ..., A_k} space. 

As an example, let us consider a gaussian form, which simulations indicate is a 
reasonable approximation, even if not an exact solution, in the localized, or 
low σ, regime. In other words, let us assume that m(r) = m_0 exp(-r^2/2l^2) and 
ρ(r) = ρ_0 exp(-r^2/2l^2). By fixing l and following the procedure described 
above, one finds that for a given value of α, Eq. 29 appears as a closed curve 
in the (m_0, ρ_0) plane, which shrinks in size with increasing α, analogously 
to Eq. 21. The value of α for which this closed curve disappears defines the 
storage capacity at constant width, α(l). Of course the gaussian ansatz is an 
approximation which needs to be considered carefully. An example of 
gaussian-like activity is shown in Fig. 2f. Unfortunately the profile of 
activity below σ_c does not take an analytical form. Still, a gaussian fit seems 
to be a good approximation, even though it is not the solution (note that in 
Fig. 2f the activity goes to zero outside a finite radius, hence it cannot be a 
gaussian). The accuracy of the gaussian ansatz can be checked by comparing the 
resulting storage capacity with that of the simulations. As we shall discuss in 
the next section, our procedure leads to a reasonable agreement with the 
simulations. 

Fig. 4 shows the result of following the above procedure for calculating α(l) 
for a network of N units and a = 0.2. The decrease in the storage capacity for 
more localized solutions can be seen. This decrease is solely due to the 
non-uniformity of the solution, and it has no contribution from the geometry of 
the connectivity, since the effect of the connectivity decouples when one 
integrates over space in the mean-field equations to get Eq. 29. In other words, 
there is no dependence on the connectivity probability distribution c(r, r') in 
this equation. 

Figure 4: The dependence of the storage capacity, for a = 0.2, on l/N, where l 
is the width of a gaussian solution. Note the correspondence with the σ values 
used in Fig. 2, see text. 

In Fig. 4 we have indicated, on the l/N axis, the widths of the profiles seen 
in the simulations with the 6 values of σ used in Fig. 2. We have run extensive 
simulations with those 6 values of σ but, unlike those in Fig. 2, with α close 
to α_c and g close to the optimal gain value, as estimated from the simulations 
themselves. To calculate the values of l corresponding to each σ close to 
capacity we have used Eq. 32: we first calculate the average value of q across 
simulations with each σ, using its definition, Eq. 31, and then find the value 
of l which solves Eq. 32. Please note that the gaussian fit is a reasonable 
approximation only in the localized regime, e.g. Fig. 2e and f, but not really 
for apparently flat solutions like those in Fig. 2a, or even for cosine-like 
ones like those in Fig. 2. Moreover, finite size effects smooth the otherwise 
sharp transition at σ_c. Finally, Fig. 2 was produced by running simulations at 
low α and fixed g, whereas we now set these parameters at the storage capacity 
limit. All these effects combine to make the estimated l values smaller than 
what one would have predicted from visually inspecting Fig. 2a-c (the 
discrepancy is milder in the localized regime). For example, looking at point a 
in Fig. 4 one sees l = IAN for σ = 1900, while from the flat-looking solution of 
Fig. 2a one might have expected l → ∞! 

One can see that such widths change significantly, and correspondingly there 
is a significant estimated capacity decrease, for the upper 3 σ values, which 
are relatively clustered around σ_c. For the lower 3 σ values, even though they 
cover a much larger range on a log scale, the resulting profile widths change 
less (l is roughly proportional to σ), as the solutions have become effectively 
localized. Correspondingly, the storage capacity does not decrease further due 
to the profiles of the solutions, although it continues to decrease due to more 
clustered connectivity. The second of the two mechanisms which decrease the 
storage capacity thus reaches a maximum effect as soon as the two flanks of 
the activity profile go to zero, and the steady state of the network is a 
genuine 'bump'. 

5 Simulations 

In this section we present the results of simulations that investigate the relation 
between the storage capacity and the width of the connectivity, as well as the 
emergence of non-uniform asymptotic states. Such states for very local 
connectivity (very low σ) eventually become localized, in the sense that 
activity is zero outside a limited fraction of the ring. 

To measure the degree of uniformity of the steady states reached in each 
simulation, we define the quantity: 

$$q = \frac{12}{N^2}\,\frac{\int dr\,(r - r_{max})^2\, m(r)}{\int dr\, m(r)} \qquad (31)$$

where r_max is where the local overlap m(r) has its maximum. We use a smoothed 
version of the local overlap to regularize the parameter 0 < q < 1, which takes 
its maximum q = 1 for a uniform solution and is inversely related to the degree 
of bumpiness, or of locality, of a spatially non-uniform overlap distribution. 
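
As an illustration of the definition of q, here is a minimal Python sketch of how it could be computed from an overlap profile sampled at the N unit positions of the ring. The function name `uniformity_q` and the use of the shortest distance around the ring are assumptions made for this example, and the smoothing step mentioned in the text is not included.

```python
def uniformity_q(m):
    """Uniformity index q of an overlap profile m sampled on a ring.

    m is a sequence of non-negative local overlaps, one per unit position.
    Distances are measured along the shorter arc of the ring (an assumption
    of this sketch), starting from the position of the maximum of m.
    """
    n = len(m)
    total = sum(m)
    if total <= 0:
        raise ValueError("the profile must have a positive total overlap")
    r_max = max(range(n), key=lambda i: m[i])
    second_moment = 0.0
    for r, value in enumerate(m):
        d = abs(r - r_max)
        d = min(d, n - d)  # shortest distance around the ring
        second_moment += d * d * value
    return 12.0 * second_moment / (n * n * total)
```

For a flat profile the function returns a value close to 1 (up to a small discretization correction), and for a profile concentrated on a single unit it returns 0.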




Figure 5: The change in the uniformity q as a function of g and σ from 
simulating a network of N = 6400, C = 320. The sparsity of the patterns is 
a = 0.2. The value of p is chosen in a way that the network on average is able 
to retrieve 50% of the patterns for the given values of g and σ. 

Here we report the results of simulating a network composed of N = 6400 
units with C = 320 and a = 0.2. For each value of σ, g and p we give the network 
a full cue, corresponding to one of the stored patterns, and after 50 
synchronous updates we measure the final overlap with the presented pattern. If 
the final overlap is larger than 0.4 we take it to be a successful retrieval. 
We repeat this for 4 different seeds for the random number generator and 5 
different patterns and then index the performance of the network with the 
percentage of patterns retrieved (according to the above criterion). We take 
the value of p at which the performance reaches 50% as a measure of the storage 
capacity for those values of g and σ. By repeating for different values of g we 
thus assess the optimal storage capacity for the network, optimized across 
values of the gain. 
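
The retrieval criterion and the 50% rule described above can be sketched as follows. This is only an illustration of the bookkeeping: the function names and the linear interpolation between sampled values of p are assumptions, and the final overlaps are taken as given (the network dynamics is not reproduced here).

```python
def retrieval_fraction(final_overlaps, threshold=0.4):
    """Fraction of trials whose final overlap exceeds the retrieval threshold."""
    successes = sum(1 for overlap in final_overlaps if overlap > threshold)
    return successes / len(final_overlaps)


def capacity_from_performance(p_values, fractions, level=0.5):
    """Estimate the p at which the performance first drops through `level`.

    p_values must be increasing; fractions[k] is the performance measured at
    p_values[k]. Linear interpolation between neighbouring samples is an
    assumption of this sketch. Returns None if there is no crossing.
    """
    for k in range(len(p_values) - 1):
        f0, f1 = fractions[k], fractions[k + 1]
        if f0 >= level > f1:
            p0, p1 = p_values[k], p_values[k + 1]
            return p0 + (f0 - level) * (p1 - p0) / (f0 - f1)
    return None
```

With 4 seeds and 5 patterns, `final_overlaps` would hold 20 values for each combination of σ, g and p; dividing the returned p by C expresses the capacity as p/C.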

In Fig. 5 one can observe the way q changes as a function of g and σ. As we 
found in section 4, for values of σ below the transition the activity is not 
uniform in space, i.e. it has a 'bumpy' profile. Increasing g favours the 
localization of the solution, which for high g and low σ becomes a genuine 
'bump' of activity, as seen in Fig. 2. There appears to be no sharp transition, 
but rather a very smooth cross-over, from the 'quasi-cosine' regime near σ_c to 
the localized 'quasi-gaussian' regime for low σ. The cross-over is in fact 
regulated by the gain g. 

The second point that we have studied through the simulations is the 
dependence of the storage capacity on the width of the connectivity σ. The 
storage capacity calculated from the simulations is the full curve in Fig. 6. 
This curve lies below the storage capacity calculated analytically using the 
uniform solution. As previously mentioned there are two effects which 
contribute to the decrease of the storage capacity. First, the increase in the 
number of loops as a result of the decrease in the width of the connectivity, 
which has the effect discussed in section 4.2.1 and shown with the dashed curve 
in Fig. 6. Second, the non-uniform profile of the solution, discussed in 
section 4.2.2. These two effects are not uncorrelated, but one can combine them 
as if they were uncorrelated, in the following way. For each value of σ one 
first calculates the storage capacity by considering only the effects of the 
loops, i.e. the dashed curve. Then one estimates the most appropriate width for 
a model localized solution, i.e. the best gaussian fit, by solving the 
following equation for l, using the value of q obtained from the simulations: 



$$q = 12\,\frac{\int_0^{0.5} dr\, r^2 \exp(-r^2/2l^2)}{\int_0^{0.5} dr\, \exp(-r^2/2l^2)} \qquad (32)$$



Then an estimate of the storage capacity, given independent effects of loops 
and localization, would simply be the product of the storage capacity 
calculated for the uniform solution and the factor α(l)/α(∞). This is the 
dotted curve in Fig. 6. It yields a lower estimate of the storage capacity than 
that of the simulations, but one closer to them than the uniform solution 
approximation. The assumption of uncorrelated effects thus overestimates the 
capacity decrease at lower σ values. 
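
A minimal numerical sketch of this two-step estimate follows. It assumes that the equation solved for l is the gaussian ansatz inserted into the definition of q, with r and l measured in units of the ring length N, so that the integrals run from 0 to 0.5. The function names, the Simpson quadrature and the bisection bounds are choices made for this example, not part of the original procedure.

```python
import math


def gaussian_q(l, steps=2000):
    """q for a gaussian profile of width l (lengths in units of N).

    Evaluates 12 * int_0^0.5 r^2 g(r) dr / int_0^0.5 g(r) dr with
    g(r) = exp(-r^2 / (2 l^2)), using Simpson's rule (steps must be even).
    The common factor h/3 cancels in the ratio.
    """
    h = 0.5 / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps + 1):
        r = k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        g = math.exp(-r * r / (2.0 * l * l))
        numerator += weight * r * r * g
        denominator += weight * g
    return 12.0 * numerator / denominator


def solve_width(q_target, lo=1e-3, hi=1e3, iterations=80):
    """Find l such that gaussian_q(l) = q_target by bisection on a log scale.

    Relies on gaussian_q increasing with l, from near 0 to near 1.
    """
    if not gaussian_q(lo) < q_target < gaussian_q(hi):
        raise ValueError("q_target is outside the range covered by [lo, hi]")
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if gaussian_q(mid) < q_target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def combined_capacity(alpha_loops, alpha_l, alpha_uniform):
    """Loop-only capacity rescaled by the localization factor.

    alpha_loops: capacity from the uniform solution with loops.
    alpha_l / alpha_uniform: capacity ratio of the localized profile to the
    uniform one in the loop-free calculation (hypothetical inputs here).
    """
    return alpha_loops * alpha_l / alpha_uniform
```

In use, `solve_width` would be called with the average q measured in the simulations at a given σ, and the resulting l would select the value of α(l) to pass to `combined_capacity`.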

Note that the capacity decrease could in principle be overestimated also as 
an effect of our procedure of exploring only spatially-dependent solutions of a 
given shape (in particular, gaussian); this effect is however likely 
negligible, at least in the localized regime where the profiles seen in the 
simulations are very close to gaussian. 

Figure 6: In this graph, the full curve shows the storage capacity versus σ for 
a = 0.2, obtained by numerically simulating a network with N = 6400, C = 320. 
The dashed curve is the storage capacity estimated analytically using the 
uniform approximation, as discussed in section 4.2.1. The dotted line 
represents the storage capacity estimated taking into account, as uncorrelated, 
both the effect of more loops and that of the spatial dependence of the 
solution, as explained in section 4.2.2. 



6 Discussion 

In this paper we have studied the retrieval properties of an autoassociative 
network with geometric synaptic connectivity. To approach the analysis of a 
spatially structured network we had to find alternatives to the standard 
thermodynamic formalism often applied to systems with quenched disorder, based 
on the calculation of a free-energy with the replica trick, and on its 
evaluation at a saddle point. Even though in terms of their behaviour, and even 
of their fixed point equations, asymmetric and symmetric networks do not differ 
by much, at least in networks with threshold-linear units for which the spin 
glass phase is irrelevant [35], the lack of a Hamiltonian requires an 
alternative approach to obtain the mean-field equations. Fukai and Shiino 
developed years ago a 'self-consistent signal-to-noise analysis' to treat cases 
in which a Hamiltonian could not be defined, in particular asymmetric networks 
[26] and networks with arbitrary analogue transfer functions [27]. The 
asymmetric connectivity is in some sense a technical problem, which might not 
reveal any 'new' physics in the threshold-linear networks. In our case, we had 
to face a second and more substantive problem, as a direct result of the 
geometry in the connectivity. This is the fact that in order to describe the 
behavior of a geometric network, one needs to introduce order parameters that 
are not scalars but scalar fields. This makes the derivation of the mean-field 
equations and the analysis of their solutions much more challenging: the 
equations are now integral equations in these field order parameters (see the 
mean-field equations derived above). We adapted the 'self-consistent 
signal-to-noise analysis', turning it into a local signal-to-noise approach, to 
find self-consistent equations for order parameters with the following physical 
meaning: 

• m(r): the local overlap, i.e. the product of the current activity and the 
stored patterns, summed over all units presynaptic to the unit at r. 

• ρ(r): the local noise, i.e. the mean square amplitude of the non-condensed 
overlaps [Hj, as seen at r. 

• ψ(r,r'): the effect of the direct and indirect connections linking two units 
at positions r and r' on the reverberation of the noise. 

• Γ(r): a measure, proportional to the diagonal elements of ψ(r,r'), of the 
effect of the activity of each unit on the noise component of its own input. 

Although the stored patterns do not include any spatial structure or corre- 
lation, the retrieval states of the geometric connectivity network, as simulations 
easily demonstrate, may have non-uniform activity profiles when the connec- 
tions are short range enough. The activity profile of the retrieval state can even 
become localized in space, as a 'bump' of activity, in the appropriate param- 
eter regime. It is worth emphasizing that a non-uniform retrieval state does 
not correspond in full to the stored pattern that is being retrieved, since the 
stored pattern does not have any spatial preference; the retrieval state has a 
large overlap with the pattern, but circumscribed to the bump. As shown in 
Fig. 5, the bumpiness of such a non-uniform retrieval state depends not only on 
how short-range the connectivity is, but also on the gain of the input-output 
transfer function. Increasing the gain favours localization in the retrieval 
state, of course as long as the network remains in the regime where retrieval 
occurs. 

Given that the storage capacity of the geometric network decreases with re- 
spect to that of a uniform network due to both an increase in the loops and 
the non-uniformity of the solution, we described a procedure to estimate these 
effects under the assumption that they are independent. Comparing with simu- 
lations, we found that the procedure yields a reasonable estimate of the storage 
capacity, although lower than the true value, due probably to the simplifying 
independence assumption. The bottom-line conclusion is that as the connectiv- 
ity of the autoassociative network becomes shorter range, its storage capacity, 
expressed as p/C, decreases indeed, but not by a large factor. 

The model that we have studied here is defined on a ring, i.e. it is a one 
dimensional model. Although we do not expect significant differences between 
the results reported here and those of a more interesting model in two dimen- 
sions, it remains to be checked, in future work, to what extent the results can 
be generalized to a 2D model. 

Further, we have considered a gaussian connectivity of varying width. An 
alternative model, introduced by Watts and Strogatz, is that of a network 
with a fraction of connections strictly short range, and the complementary 
fraction distributed at random across the entire network. This could perhaps be 
an appropriate model for cortical connectivity, with its short- and long-range 
connections, and it has already been considered, albeit in a geometry-free 
formulation, as a model of semantic memory [22, 14]. It seems important to be 
able to extend our approach to a model of the Watts-Strogatz type, with 1D 
or 2D geometry. It is possible that the storage capacity would decrease more 
substantially in such a model than it does in the one considered here; a naive 
expectation is that p should scale, essentially, with the number of short range 
connections, those that comprise the so-called regular network. 

Indeed, the Watts-Strogatz autoassociative model has been studied by [5], 
even though with simulations only and a more realistic neuronal integrate-and- 
fire dynamics. One result of that study is the near incompatibility between 
localization and retrieval, which appear to occur in almost mutually exclusive 
ranges of the relevant parameter, the fraction of short-range connections. The 
fact that retrieval did not seem to succeed when asymptotic firing states are 
localized may have been due to a strongly decreased capacity, or to instabili- 
ties associated with the integrate-and-fire dynamics, or to other aspects of that 
model, such as the mechanism implementing inhibitory control of the activity 
of excitatory units. Whatever the case, this emphasizes the need for studies of 
associative retrieval and localization in realistic neural network models. Impor- 
tant features that need consideration are the dynamics of individual units and 
the model adopted for inhibitory effects. 

Once thus extended, this approach has the potential to yield a quantita- 
tive assessment of the storage capacity of cortical modules in the mammalian 
brain. Such modules obviously differ from our simplified mathematical models 
in several respects, but an educated guess predicts that the crucial factor that 
determines their storage capacity is the number and geometrical distribution of 
the Hebbian-modifiable connections that the average unit receives. 

It becomes possible at that stage to consider in a realistic setting a phe- 
nomenon which had been considered earlier in rather abstract models: the co- 
existence of multiple retrieval states. So-called 'spurious' solutions have been 
shown not to be stable, essentially, in threshold-linear networks without geome- 
try PI] , but in a simple ring model |5J different stored patterns can be retrieved 
in different portions of the ring. Thus the underlying geometrical manifold can 
turn spurious mixtures of patterns into interesting combinations of localized 
patterns, and this raises the issue of their competitive interactions. Recent 
neurophysiological work [23] demonstrates that the receptive fields of visually 
evoked activity patterns are effectively restricted 2- to 3-fold, in the macaque 
temporal cortex, when several objects are present together in the visual scene. 
This finding may have a correlate in an associative memory network with ge- 
ometry, where long-range inhibition may restrict the width of activity profiles 
retrieved on different portions of the manifold. 

Appendix. I 

Using Eq. 12, the solution for the local overlap with a non-retrieved pattern 
can be written as: 
where R^ is expanded in a power series as: 

fl £ = c ^ + E + E ^« c « + ■ • ■ (!- 2 ) 
i u 

and we have used the notation G^[i\ — G[(r]j/a — l)m| + p^z + b(x) — T t hr\- 

Now that we have expressed the local overlap with a non-retrieved pattern 
as a function of m], we can proceed to evaluate p and 7: 

E«-1K = ^E^W/a-l) 2 G^] (1-3) 

+ ^ E RiM^-^M^-wm- 



For the first sum on the r.h.s. of Eq. I-3 above, using the independence of 
different patterns and assuming that ρ_i^μ ≈ ρ_i, one can write: 



^E^^A 1 - 1 ) 2 ^] = *{R»M/a-lfG v \i]) (1-4) 

~ «(^(ryr/fl-l) 2 >^ 

and as a result we identify: 

7i = a< R»M/a - If >= aT (R^), (1-5) 
where we denote, as in [32], T_0 = 1/a - 1; and therefore: 

r s: = aT «J%) - c u ) = aTo(E + E K il K ^tj + •••)• P-6) 

The second term is a bit tricky. For this term, by replacing the sum with 
the average we get zero mean, but for the standard deviation we have: 

P 2 = g(l/a-l)E(^- 2 (^7«- l) 2 ^'] 2 ) (1-7) 
3 

which is, actually, the standard deviation of the noise. We can then replace 
the second term, that is the noise term, with a gaussian random variable with 
mean zero and standard deviation ρ, and take it into account in our mean-field 
equations by averaging the equations over this gaussian measure. We shall 
discuss the reliability of this assumption soon. 

In order to make the equations more comprehensible, we define the order 
parameter ψ_ij as an average over non-retrieved patterns of where: 

= K ~ Cij = ]T K u c U + E K u K fai + ■■■ (1-8) 
I It 

By using this definition with some algebraic manipulation we get the mean-field 
equations. 

Considering the noise term as a gaussian random variable comes from rewriting 
the left hand side of Eq. 7 in its original form and then approximating it as 
a sum of random independent variables: 

p t z + 7^ = — °ij(Vi/a- l)(Vj/a- (1-9) 

The assumption of independence of these terms is a bit tricky, since in general 
when following the dynamics of the system the activity of each unit Vi at a given 
time depends on the local field at the previous time step, hence on the patterns. 
However, it becomes an appropriate approximation when the first pattern is 
thoroughly retrieved, i.e. V_j = η_j^1. In this case the terms that appear in the 
noise sum are in fact close to independent random variables, as the patterns 
are, by construction. The gaussian noise assumption is thus accurate in the 
limit in which the solutions of the mean-field equations are exactly equal to the 
patterns used in generating the weight matrix, i.e. in the case of retrieval 
without errors. This is of course never strictly the case, unless the storage 
capacity is zero in the thermodynamic limit 0. However, if we assume that the 
retrieved state in a successful retrieval is very close to a stored pattern and has 
a large overlap with it, then the gaussian approximation is reasonable. In other 
words the gaussian noise is appropriate when retrieval really occurs. 
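
The argument above can be checked numerically in a toy setting. The sketch below is not the model of the paper in full: it assumes binary patterns with sparsity a, a 1/C normalization of the noise sum, perfect retrieval (V_j equal to the first pattern), and statistically independent units. All function names and parameter values are chosen for this example. It only compares the sample mean and variance of the noise with the values expected for independent terms, and does not test gaussianity itself.

```python
import random
import statistics


def noise_samples(units=2000, C=50, P=20, a=0.2, seed=0):
    """Sample the noise term on `units` independent units (illustrative model).

    Assumed form: (1/C) * sum_mu sum_j (eta_i^mu/a - 1)(eta_j^mu/a - 1) V_j,
    with binary patterns of sparsity a and V_j equal to the retrieved pattern,
    which is independent of the P non-retrieved patterns summed over here.
    """
    rng = random.Random(seed)

    def centred_pattern_value():
        # (eta/a - 1) for a binary eta that equals 1 with probability a
        return (1.0 / a - 1.0) if rng.random() < a else -1.0

    samples = []
    for _ in range(units):
        # Presynaptic units with V_j = 0 contribute nothing, so only the
        # active ones among the C inputs need to be counted.
        active_inputs = sum(1 for _ in range(C) if rng.random() < a)
        total = 0.0
        for _ in range(P):
            post = centred_pattern_value()
            pre = sum(centred_pattern_value() for _ in range(active_inputs))
            total += post * pre
        samples.append(total / C)
    return samples


def predicted_variance(C=50, P=20, a=0.2):
    """Variance expected if all terms in the noise sum are independent."""
    T0 = 1.0 / a - 1.0
    return P * T0 * T0 * a / C
```

Under these assumptions the sample mean should be close to zero and the sample variance close to `predicted_variance()`; repeating the experiment with V_j correlated with the non-retrieved patterns would be one way to probe where the approximation starts to fail.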

It is important to note that one should be careful in using the gaussian 
approximation when dealing with the dynamics of the network. This approxi- 
mation may just qualitatively predict the recall dynamics when retrieval occurs 
(as well as the stationary state phase diagram) but it fails to describe the non- 
retrieval trajectories. This is simply due to the fact that the noise distribution 
can be approximated by a gaussian when retrieval is successful, as explained 
above, but it could be non-gaussian in the intermediate states before getting 
close to the retrieval attractor. Starting from an initial state which for instance 
has a non-zero overlap with two patterns, the noise does not follow in general 
a gaussian distribution. It approaches a gaussian form as the network evolves 
toward one of the attractors, corresponding to one of the patterns. The gaussian 
approximation may also give the correct dynamical equations for the first few 
time steps, but then it may give results different from the exact solutions later 
on. For a detailed discussion on this issue see m |5] . 

As we shall see later, in the limit of a structure-less network the mean-field 
equations from the gaussian approximation are identical to those found 
previously using the replica method. This confirms that the equations that 
result from the gaussian approximation in the signal-to-noise analysis are, in 
terms of accuracy, on a par with the equations derived with the replica method. 
Actually, the close relation between the signal-to-noise analysis, the replica 
method and the TAP equation has been investigated in a recent paper by Shiino 
and Yamana [28]. 



Appendix. II 

Dividing by the gain factor the equations for ρ and m one gets: 

On the other hand, using the equation for ψ and the definition of Ω in Eq. 17, 
we have: 

and we find: 

n= (J + Dz)/A 2 . (II-l) 

Combining the equations above one gets to Eqs|2] and 03 The second of 
those equations determines the optimal value of g, but it does not change the 
storage capacity. 

where we have set r = mo/ po and w = [b (x) — tuq — T t h r ]/po- 